The model of Big Five personality factors is currently the most widely
known and the most frequently used taxonomy of personality traits. The model is 
usually referred to as the Big Five or the Five-Factor Model. Although these 
names are sometimes used interchangeably, they have their roots in two different 
research traditions. One of these is the lexical tradition and the other one is the 
tradition of questionnaire research (De Raad & Perugini, 2002; John & Srivasta- 
va, 1999; cf. Siuta, 2009). 

The questionnaire tradition gave rise to commonly known and widely used 
questionnaires for measuring five personality traits: the NEO-PI-R (NEO Perso- 
nality Inventory Revised) and the NEO-FFI (NEO Five-Factor Inventory), 
developed by Costa and McCrae (1992). The NEO-PI-R was adapted into Polish 
by Siuta (2009), and the Polish adaptation of the NEO-FFI was prepared by Za- 
wadzki, Strelau, Szczepaniak and Śliwińska (1998). The lexical tradition also 
developed measures known in the literature worldwide, but they have not been 
commonly used in Poland so far. Filling this gap is the primary aim of the 
present article, which is a presentation of the Polish adaptation of Goldberg's 
(1992) IPIP-BFM-50 (i.e., the 50-item Big Five Markers questionnaire from the 
resources of the International Personality Item Pool). It consists of sentences 
(just like questionnaires in the psychometric tradition) but serves to measure the 
Big Five as identified in lexical research. 

A majority of measures in the questionnaire tradition are available today in 
the form of licensed commercial questionnaires. They are used not only in scien- 
tific research but also in individual assessment. It is partly because of the diag- 
nostic character of these measures that they are under special protection (e.g., it 
is forbidden to publish or modify test items, and the questionnaires may be used 
almost exclusively by authorized assessment psychologists), one of the conse- 
quences being that these questionnaires are available at a charge. 

In Poland, there is a lack of noncommercial instruments measuring five per- 
sonality traits that would have good psychometric properties while having been 
created for research purposes (not for individual assessment), even though there 
are many such instruments in the literature worldwide. Filling this gap is the 
second aim of the present paper. The IPIP-BFM-50 questionnaire is part of the 
resources of the International Personality Item Pool (IPIP; Goldberg, 1999; 
Goldberg, Johnson, Eber, et al., 2006), being a collection of test items and ques- 
tionnaires available for researchers free of charge, without usage restrictions 
typical for licensed commercial measures. A consequence of the questionnaires 
from IPIP resources being designed for research is the lack of norms that usually 
accompany licensed measures serving also for diagnostic examinations. This is 
also the case for the IPIP-BFM-50 questionnaire, which we have adapted in the
research presented below. 

It is worth adding that this questionnaire is referred to in the literature by 
various names: IPIP Big Five (Zheng, Goldberg, Zheng et al., 2008), Goldberg’s 
IPIP 50 (Guenole & Chernyshenko, 2005), or IPIP Five-Factor Model (Donnel- 
lan, Oswald, Baird, & Lucas, 2006). Since there is no single established name, 
we propose IPIP-BFM-50, referring to the 50-item Big-Five Markers from the 
resources of the International Personality Item Pool. The name we propose ap- 
pears to be the most precise one because it points to three key properties of this 
questionnaire: 1) its origination in the IPIP project, 2) its direct link with the 
lexical tradition, and 3) the version of the measure, a 50-item version in this case, 
whereas IPIP resources contain a 100-item version of the BFM questionnaire as 
well (Strus, Cieciuch, Davidov et al., 2013). 


Five Personality Traits 
in the Lexical Tradition 


The Five-Factor Model was originally developed and then verified within the 
lexical research tradition. Partly independently and partly on the basis of lexical 
research, the psychometric approach in research on the five factors of personality 
came into being, in which the model was theoretically elaborated and embedded 
in a broader theory of personality. 

The key to the psycholexical approach is the so-called lexical hypothesis. It 
assumes that the most important individual personality differences have been 
encoded in the form of individual terms in some or in all of the world's languages 
(cf. Goldberg, 1981, 1990). This idea inspired a number of studies, initially con- 
ducted mainly on the English lexicon, which led to the identification and mul- 
tiple replication of the structure of five big personality factors. Initiated by All- 
port and Odbert as well as by Cattell and developed by Fiske, Tupes, and Christal 
as well as by Norman (John & Srivastava, 1999), this research current was con- 
tinued in the 1980s and 1990s by Goldberg. Goldberg carried out a series of stu- 
dies (Goldberg, 1981, 1990, 1992) that made him the leading figure of the lexical 
approach. It was also Goldberg (1981) that introduced the term Big Five itself. 
The five personality traits distinguished in this tradition are the following: Extra- 
version or Surgency (Factor I), Agreeableness (Factor II), Conscientiousness 
(Factor III), Emotional Stability (Factor IV), and Intellect or Imagination (Factor V). 


Adjective-Based Measurement
of the Five Traits in the Lexical Tradition 


The main aim of the lexical tradition was to describe the structure of perso- 
nality encoded in language, understood as a set of independent factors. Various 
lists of adjectives were used, though they served as material for lexical research 
rather than as instruments for measuring any particular constructs (cf. Goldberg, 
1990, 1999). However, when that aim had been achieved and personality factors 
had been identified, instruments for measuring them also began to appear. 

These instruments traditionally consisted of adjectives used in self-report and 
observer-rating studies. The most frequently used measures in this tradition in- 
clude two versions of Goldberg’s (1992) adjectival Big Five Factor Markers 
(BFM) and Saucier’s (1994) Mini-Markers. The first version of BFM consisted 
of 100 adjectives, rated on a 9-point scale. The second version of BFM consisted
of 50 pairs of opposing adjectives. Saucier’s Mini-Markers (1994) is a
40-adjective version of the 100-adjective BFM measure.

All these measures were developed on the basis of the English language, in 
which the Big Five was identified. At the same time, lexical research was carried 
out in languages other than English, too (cf. e.g., De Raad, Perugini, Hrebickova, 
& Szarota, 1998; Gorbaniuk, Budzinska, Owczarek, Bozek, & Juros, 2013). 
Sometimes it also resulted in measures being developed. In Poland, Szarota 
(1995) created the Polish Adjective List for measuring five personality traits 
identified in Polish lexical studies. Still, these measures were only used locally 
because they were designed for measuring personality traits identified in a par- 
ticular local language and within local culture. 


Sentence-Based Measurement 
of the Five Traits in the Lexical Tradition 


In the lexical tradition, measures had the form of adjective lists. By contrast, 
from the very beginning, measures in the questionnaire tradition had the form of 
sentence sets. Either form involves considerable problems. Adjectives as items in 
measures of personality represent behaviors on a high level of abstraction. They 
are very general, imprecise, and often ambiguous; they do not take into account 
the context or the motivational aspect; they also make up a finite set (Jarmuz, 
1994; Saucier & Goldberg, 2002). 

Sentences are more semantically specific and embedded in a context and 
may take motivation into account, but in the case of commonly used question- 
naires such as the NEO-PI-R or the NEO-FFI they are often rather long and 
complicated. As a result, their understanding is also often ambiguous and
depends on participants’ verbal skills. For this reason, in the IPIP project, of which
the IPIP-BFM-50 is part, Goldberg (1999) adopted a solution that overcomes, at
least partly, the shortcomings of both adjectives and sentences as items in
personality measures.

The IPIP-BFM questionnaire measures the five personality factors identified 
in the lexical tradition (Goldberg, 1992), but its items are sentences. At the same 
time, the form of these sentences differs from that of the sentences usually mak- 
ing up measures in the questionnaire tradition. This is because IPIP items follow 
the format developed by Hendriks, Hofstee, and De Raad (1999). Its essence lies 
in the items being short sentences formulated in behavioral terms. 

The IPIP-BFM was developed on the basis of a study in which participants 
rated themselves using, among others, the 100-adjective version of BFM and 
responded to the pool of 1252 sentence items from IPIP resources (Goldberg, 
1999). The items for the sentence-based version of BFM were selected on the 
basis of correlations with factors obtained in a study using the adjective-based 
version of BFM. This manner of selecting items made it possible to avoid both
an arbitrary choice of sentences and the ambiguity characteristic of
adjective-based scales.

Two versions of the IPIP-BFM were thus developed: the 100-item version 
(IPIP-BFM-100) and the 50-item version (IPIP-BFM-50). All the items of the 
IPIP-BFM-50 are present in the IPIP-BFM-100, and correlations between the 
scales of the two versions ranged from .94 to .96 (Saucier & Goldberg, 2002). 


Five Personality Factors 
in the Lexical and Questionnaire Traditions 


Despite differences in the conceptualization of the five personality factors 
between the lexical and questionnaire traditions, a far-reaching correspondence 
exists between the two models — both in the theoretical meaning of the factors 
and in the empirical research conducted (cf. Biderman, Nguyen, Cunningham 
et al., 2011; Goldberg, 1992; John & Srivastava, 1999; McCrae & John, 1992). 

The three most important differences between Goldberg’s (1992) Five-Factor
Model, which is the most widely known model in the lexical tradition, and Costa
and McCrae’s (1992) model, which is the most widely known model in the
questionnaire tradition, are the following: (1) the meaning of Factor V: Intellect
in the lexical tradition comprises a narrower range of personality properties
than Openness to Experience in the questionnaire tradition; (2) the placement of
the warmth trait: in the questionnaire tradition it is a facet of Extraversion,
whereas in the lexical model it falls into Agreeableness (John & Srivastava,
1999); (3) the name of Factor IV: the factor is understood in similar ways in the
two models, but it is named after the Emotional Stability pole in the lexical
approach and after the Neuroticism pole in the questionnaire tradition.

The IPIP-BFM-50 is an instrument for measuring the Big Five in Goldberg’s 
(1992) lexical approach, corresponding to the five factors of personality in the 
questionnaire tradition. In the lexical approach, factors are traditionally described 
by the adjectives that have the highest loadings on them. In Table 1 we propose 
descriptions of variables measured by the IPIP-BFM-50 questionnaire. Those 
descriptions are based on Goldberg’s (1992) list of the 100 best lexical markers of
the Big Five, namely on the Big Five Factor Markers.


Table 1
Description of the Five IPIP-BFM-50 Scales

Extraversion
  Object of measurement: The level of activity, energy, as well as sociability
    and social self-confidence (assertiveness).
  Individuals who score high may be described as: active, energetic,
    extraverted, talkative, bold, and assertive.
  Individuals who score low may be described as: [illegible in source] and
    socially inhibited.

Agreeableness
  Object of measurement: Positive (vs. negative) attitude towards people.
  Individuals who score high may be described as: trustful, kind, considerate
    and warm as well as cooperative and helpful.
  Individuals who score low may be described as: distrustful, selfish, unkind,
    rude, and emotionally cold towards other people.

Conscientiousness
  Object of measurement: The level of organization, diligence in pursuing goals
    and performing tasks as well as proneness to order and dutifulness.
  Individuals who score high may be described as: organized, diligent, thorough
    and efficient in what they do as well as systematic and dutiful.
  Individuals who score low may be described as: unsystematic and inconsistent,
    unconcerned with order and planning, negligent, careless, and undependable.

Emotional Stability
  Object of measurement: The level of reactivity and emotional stability,
    emotional resistance and tolerance to frustration.
  Individuals who score high may be described as: imperturbable, calm, relaxed,
    not prone to negative emotional states.
  Individuals who score low may be described as: anxious, nervous, moody, prone
    to worry and oversensitive as well as envious, touchy, prone to anger and
    irritation.

Intellect
  Object of measurement: Intellectual openness, creativity, and imagination.
  Individuals who score high may be described as: intellectually active and
    cognitively open, creative, introspective, having a vivid imagination and
    a wide range of interests.
  Individuals who score low may be described as: unintellectual, noninquisitive,
    unimaginative, simple, unsophisticated, unreflective and uncreative.


Hypotheses
in the Present Study 


When adapting the measure into Polish, we formulated the following expec- 
tations: 

1. We expected the scales to have satisfactory reliability. We verified this 
expectation by analyzing the values of Cronbach’s alpha. 

2. We expected the questionnaire to have a five-factor structure. We verified 
this expectation by performing confirmatory factor analysis. 

3. We expected the IPIP-BFM-50 to be a measure unaffected by various re- 
search conditions. We verified this expectation by performing two measurement 
invariance tests. The first test verified the invariance between the arrangement of 
test items in the questionnaire and an arrangement mixed with a pool of other test 
items. The other test verified measurement invariance between two research con- 
ditions: paper-and-pencil study and online study. 

4. We expected satisfactory external validity. In verifying this expectation, 
we assumed that Goldberg’s (1992) lexical Big Five was similar to the five fac- 
tors of personality distinguished by Costa and McCrae (1992). We performed 
a verification of expectations using two techniques. The first was the analysis of 
correlations with the measurements of five personality traits in the questionnaire 
tradition. The other was the comparison of results concerning the differentiation 
of personality traits by gender and age with the results reported in the literature, 
obtained using measures developed in the questionnaire tradition. 


METHOD 


Participants and Procedure 


We conducted a series of eight studies, of which seven were carried out using 
the paper-and-pencil method and one in online conditions. Those studies were 
carried out as part of several research projects concerning various aspects of 
personality, its structure and development. 

Participation was voluntary and anonymous for everyone. The first, second, 
fifth, and seventh studies were carried out by trained students, who volunteered 
to help. Each student carried out a study in a group of 5 to 10 people. The second 
and third studies were conducted on a group basis, at schools and universities, by 
trained research assistants. The sixth study was carried out using the paper-and- 
pencil method by Magdalena Leśniewska as part of her master’s thesis research. 
The eighth study was carried out online by the authors and by students as part of
their master’s thesis research. Participants were recruited via Facebook. 

Data were collected from a total of 7127 people, but the analyses presented 
below were performed on a group of 7015 participants, 112 individuals (1.5%) 
having been excluded due to outliers or missing data. We adopted 10% of un- 
answered test items as the threshold. The scores of individuals with the number 
of missing data equal to or higher than this threshold were excluded from further 
analyses. Table 2 gives the number of individuals analyzed in each study, togeth- 
er with information about their gender and age. 
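
The missing-data rule described above can be illustrated with a minimal sketch. It assumes that each record is a list of 50 responses in which unanswered items are stored as None; the function names are hypothetical, and the screening for outliers mentioned above is not covered.

```python
from typing import Optional, Sequence

N_ITEMS = 50              # number of items in the IPIP-BFM-50
MISSING_THRESHOLD = 0.10  # share of unanswered items reported in the text


def missing_share(responses: Sequence[Optional[int]]) -> float:
    """Return the proportion of unanswered items (None) in one record."""
    return sum(r is None for r in responses) / len(responses)


def keep_participant(responses: Sequence[Optional[int]],
                     threshold: float = MISSING_THRESHOLD) -> bool:
    """Keep a record only if its share of missing items is below the threshold.

    Records at or above the threshold are excluded, as stated in the text.
    """
    return missing_share(responses) < threshold


# Hypothetical records, for illustration only.
complete = [3] * N_ITEMS
four_missing = [None] * 4 + [3] * 46   # 8% missing, below the threshold
five_missing = [None] * 5 + [3] * 45   # 10% missing, at the threshold
```

With 50 items, the 10% threshold corresponds to five unanswered items, so a record with four gaps is retained and a record with five is dropped.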


Table 2
Age, Gender, and the Number of Participants in Study Groups

Group      N (% of women)    Age M    Age SD
Study 1     936 (54.9)       30.78    13.68
Study 2     685 (48.2)       31.10    11.39
Study 3     304 (50.5)       18.00     0.13
Study 4     414 (34.6)       22.02     3.48
Study 5     679 (34.3)       27.20    12.30
Study 6     861 (60.4)       38.94    14.44
Study 7     789 (56.7)       29.65    12.26
Study 8    2347 (58.3)       27.23     9.85
Total      7015 (52.9)       28.93    12.16


Measure


IPIP-BFM-50 test items


The IPIP-BFM-50 questionnaire was developed as part of Goldberg’s IPIP 
project (Goldberg, 1999; Goldberg, Johnson, Eber et al., 2006), which also in- 
cludes other instruments for measuring personality traits. One of them is the 
IPIP-45AB5C questionnaire, measuring 45 variables of the AB5C model (Ab- 
ridged Big Five Dimensional Circumplex), developed by Hofstee, De Raad, and 
Goldberg (1992). Strus, Cieciuch, and Rowiński (2014) adapted this measure 
into the Polish language. The measure consists of 486 items, including 48 items
of the IPIP-BFM-50. In the first study we used the IPIP-45AB5C questionnaire, 
extended by the missing two items of the IPIP-BFM-50. The study using the 
IPIP-45AB5C was carried out in two rounds with a two-week interval between 
them. 


The IPIP-BFM-50 questionnaire 


In all the successive studies, the IPIP-BFM-50 was used as a separate meas- 
ure, consisting of 50 items. Each of the scales (Extraversion, Conscientiousness, 
Agreeableness, Emotional Stability, and Intellect) consists of 10 items. Participants
respond to the items on a 5-point Likert scale from 1 (very inaccurate) to 5 (very
accurate). The items were translated by the authors of the present paper. In the
process of translation, efforts were made to ensure both linguistic fidelity to the 
original and theoretical correspondence between the items and the constructs 
measured. 
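
How a scale score could be derived from item responses is sketched below. The sketch rests on two assumptions that are not stated in the text: that a scale score is the mean of its item responses (which would match the 1 to 5 range of the scale means in Table 3), and that negatively keyed items are recoded before averaging. The key shown is hypothetical and covers four items only; it is not the actual IPIP-BFM-50 key.

```python
from statistics import mean
from typing import Dict

SCALE_MIN, SCALE_MAX = 1, 5  # 1 = very inaccurate, 5 = very accurate

# Hypothetical key, for illustration only:
# item identifier -> +1 (positively keyed) or -1 (negatively keyed).
EXAMPLE_KEY: Dict[int, int] = {0: +1, 1: -1, 2: +1, 3: -1}


def recode(response: int, keying: int) -> int:
    """Return the response as is, or mirrored on the 1-5 scale if negatively keyed."""
    if not SCALE_MIN <= response <= SCALE_MAX:
        raise ValueError(f"response {response} is outside the 1-5 scale")
    return response if keying > 0 else SCALE_MIN + SCALE_MAX - response


def scale_score(responses: Dict[int, int], key: Dict[int, int]) -> float:
    """Mean of the (recoded) responses to the items listed in the key."""
    return mean(recode(responses[item], keying) for item, keying in key.items())


# Hypothetical participant: strong agreement with the positively keyed items
# and strong disagreement with the negatively keyed ones.
example_responses = {0: 5, 1: 1, 2: 4, 3: 2}
example_score = scale_score(example_responses, EXAMPLE_KEY)
```

Mirroring a response as 6 minus its value keeps all items on the same 1 to 5 metric, so that a higher scale score always indicates a higher level of the trait.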


The NEO-PI-R and the NEO-FFI 


In three studies we also applied measures of personality traits developed in 
the questionnaire tradition. In a subgroup of N = 685 participants in the second 
study, the measure used was the NEO-PI-R by Costa and McCrae (1992) as 
adapted into Polish by Siuta (2006). This questionnaire was applied in a separate 
research session, carried out about two weeks after the study using the
IPIP-BFM-50. In the seventh study, the measure applied was Costa and McCrae’s
(1992) NEO-FFI as adapted into Polish by Zawadzki and colleagues (1998),
which was administered to N = 782 individuals.


RESULTS AND DISCUSSION 


Descriptive Statistics 


Table 3 presents descriptive statistics for each of the five traits measured by 
the IPIP-BFM-50. The distribution of scores for each scale is close to normal.
Skewness and kurtosis range between -1 and +1; the only slight deviation
was found for the Agreeableness scale in the fourth study.


Table 3
Reliability and Descriptive Statistics of the Questionnaire’s Scales for Each Study Group

Scale                 Study group   M      SD     Skewness   Kurtosis   α
Extraversion          Study 1       3.30   0.72   -0.041     -0.401     .86
                      Study 2       3.36   0.71   -0.092     -0.191     .87
                      Study 3       3.46   0.73   -0.199     -0.465     .86
                      Study 4       3.35   0.70    0.027     -0.039     .82
                      Study 5       3.35   0.75   -0.103     -0.442     .89
                      Study 6       3.39   0.70    0.126     -0.333     .83
                      Study 7       3.49   0.78   -0.211     -0.466     .91
                      Study 8       3.31   0.81   -0.151     -0.390     .90
Agreeableness         Study 1       3.91   0.52   -0.505      0.169     .79
                      Study 2       3.86   0.55   -0.283     -0.179     .82
                      Study 3       3.70   0.57   -0.069     -0.631     .79
                      Study 4       3.81   0.65   -0.096     -1.128     .79
                      Study 5       3.93   0.54   -0.432     -0.003     .82
                      Study 6       3.86   0.63   -0.250     -0.694     .82
                      Study 7       3.98   0.50   -0.484      0.152     .79
                      Study 8       3.84   0.63   -0.611      0.370     .84
Conscientiousness     Study 1       3.46   0.63   -0.145     -0.321     .80
                      Study 2       3.58   0.63   -0.297     -0.137     .82
                      Study 3       3.14   0.64    0.012     -0.184     .80
                      Study 4       3.3?   0.63    0.186     -0.423     .75
                      Study 5       3.49   0.61   -0.100     -0.241     .79
                      Study 6       3.62   0.61   -0.252     -0.061     .75
                      Study 7       3.47   0.65   -0.086     -0.564     .83
                      Study 8       3.43   0.68   -0.138     -0.298     .83
Emotional Stability   Study 1       2.98   0.76   -0.018     -0.365     .87
                      Study 2       3.11   0.79   -0.252     -0.368     .90
                      Study 3       3.02   0.80   -0.051     -0.547     .88
                      Study 4       3.05   0.79    0.030     -0.258     .86
                      Study 5       2.93   0.72   -0.121     -0.181     .87
                      Study 6       3.15   0.76    0.180     -0.016     .86
                      Study 7       3.19   0.80   -0.205     -0.551     .90
                      Study 8       3.01   0.84   -0.048     -0.558     .90
Intellect             Study 1       3.49   0.57   -0.099     -0.231     [illegible]
                      Study 2       3.46   0.55    0.092     -0.115     .78
                      Study 3       3.59   0.58   -0.016     -0.450     .78
                      Study 4       3.56   0.53    0.239     -0.334     .70
                      Study 5       3.56   0.57    0.164     -0.147     .80
                      Study 6       3.41   0.57    0.164     -0.258     .73
                      Study 7       3.61   0.59   -0.087     -0.506     .82
                      Study 8       3.73   0.58   -0.218     -0.155     .78


Measurement Reliability 


The values of Cronbach’s α coefficient obtained in each study for the five
scales of the IPIP-BFM-50 are presented in Table 3. They range from .73
(Intellect in the sixth study) to .91 (Extraversion in the seventh study).
Discriminating power ranges from .39 (Intellect in the seventh study) to .67
(Extraversion in the seventh study). The mean values of Cronbach’s α, computed on the basis
of all the eight studies, are the following: .87 for Extraversion; .81 for Agreea- 
bleness; .80 for Conscientiousness; .88 for Emotional Stability, and .77 for Intel- 
lect. Measurement reliability may therefore be called very high. 
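
For readers who wish to reproduce such coefficients on their own data, the computation of Cronbach’s α can be sketched as follows. The sketch also computes a corrected item-total correlation; treating this statistic as the "discriminating power" reported above is an assumption, since the text does not define the term. The data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Sequence


def cronbach_alpha(data: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(data[0])
    item_variances = [variance([row[j] for row in data]) for j in range(k)]
    total_variance = variance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(data: Sequence[Sequence[float]], j: int) -> float:
    """Correlation of item j with the sum of the remaining items of the scale."""
    item = [row[j] for row in data]
    rest = [sum(row) - row[j] for row in data]
    return pearson_r(item, rest)


# Hypothetical responses of five people to a three-item scale.
toy_data: List[List[int]] = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]
toy_alpha = cronbach_alpha(toy_data)
toy_item_total = corrected_item_total(toy_data, 0)
```

The item is removed from the total before correlating so that it does not inflate its own index.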


Factorial Validity 


Factorial validity was verified in confirmatory factor analysis, performed us- 
ing Amos 21 software. Because of the large number of test items measuring 
a given trait, we applied the parceling procedure, in which the means of groups
of items (parcels), rather than individual items, are entered as observed variables
(Little, Cunningham, Shahar, & Widaman, 2002). In our research, we randomly
divided the items making up each scale into three parcels. Figure 1 presents the tested model,
with factor loadings and correlations between latent variables in the aggregate 
group from all studies (N = 7015). 
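
The random parceling step can be sketched as follows. The text states only that the ten items of each scale were divided randomly into three parcels; the parcel sizes (4, 3, and 3), the seed, and the function names used here are assumptions made for illustration.

```python
import random
from statistics import mean
from typing import Dict, Iterable, List


def make_parcels(item_ids: Iterable[int], n_parcels: int = 3,
                 seed: int = 0) -> List[List[int]]:
    """Randomly split item identifiers into n_parcels groups of near-equal size."""
    rng = random.Random(seed)  # fixed seed so that the split is reproducible
    shuffled = list(item_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled items out in turn, like cards, to balance parcel sizes.
    return [shuffled[i::n_parcels] for i in range(n_parcels)]


def parcel_scores(responses: Dict[int, float],
                  parcels: List[List[int]]) -> List[float]:
    """Mean response within each parcel; these means enter the model as indicators."""
    return [mean(responses[item] for item in parcel) for parcel in parcels]


# Hypothetical example: the ten items of one scale, numbered 0 to 9.
extraversion_parcels = make_parcels(range(10))
example_responses = {item: 3 for item in range(10)}
example_indicators = parcel_scores(example_responses, extraversion_parcels)
```

Each latent factor is then measured by three parcel means instead of ten single items, which reduces the number of observed variables in the model from 50 to 15.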

[Figure 1: path diagram in which each of the five latent factors (Extraversion, Agreeableness, Conscientiousness, Emotional Stability, Intellect) is measured by three parcels, e.g., E_PARCEL_1 to E_PARCEL_3.]

Figure 1. The measurement model of the IPIP-BFM-50 questionnaire with factor loadings and 
correlations between latent variables in the aggregate group from all studies (N = 7015). 


The assessment of model fit to data was based on RMSEA, CFI, and SRMR
indices. RMSEA and SRMR below .08 and CFI above .9 are adopted as threshold
values of model acceptability (Hu & Bentler, 1999; Marsh, Hau, & Wen,
2004).
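
The acceptability criteria can be expressed as a short check, sketched below together with a commonly used single-group formula for RMSEA, sqrt(max(χ² - df, 0) / (df × (N - 1))). Whether the software used in the study applies exactly this variant of the formula is an assumption; the input values are those reported for the first and fourth studies (χ² and fit indices in Table 4, sample sizes in Table 2).

```python
from math import sqrt


def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from the chi-square statistic (single-group case)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def acceptable_fit(cfi: float, rmsea_value: float, srmr: float) -> bool:
    """Apply the thresholds adopted in the text: CFI > .9, RMSEA and SRMR < .08."""
    return cfi > 0.90 and rmsea_value < 0.08 and srmr < 0.08


# Study 1: chi-square = 396.96, df = 80, N = 936.
study1_rmsea = rmsea(396.96, 80, 936)
study1_ok = acceptable_fit(cfi=0.944, rmsea_value=study1_rmsea, srmr=0.053)

# Study 4: chi-square = 390.99, df = 80, N = 414.
study4_rmsea = rmsea(390.99, 80, 414)
study4_ok = acceptable_fit(cfi=0.890, rmsea_value=study4_rmsea, srmr=0.078)
```

Under this formula the recomputed RMSEA values round to the ones reported for the two studies, and the check flags the fourth study as the only one of the two that misses the adopted thresholds.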

Table 4 presents model fit indices for each of the eight studies and for all the 
data taken together. In the fourth study, the model was found to fit the data poorly.
In the fifth study, RMSEA is higher than .08, and in the seventh study it is equal
to that value. SRMR falls within the acceptable range in all of the studies and 
CFI falls within that range in all except the fourth one. All the three goodness-of- 
fit indices obtained for the whole sample fall within the limits of acceptability 
according to the criteria given above. Taking into account the specificity of mea- 
surement and the problems, reported in the literature, with confirmatory factor 
analysis of questionnaires for measuring the Big Five as well as the values of fit 
indices obtained for the whole sample, the obtained results may be regarded as 
satisfactory verification of the five-factor structure of the IPIP-BFM-50 ques- 
tionnaire. 


Table 4
Fit Indices of Confirmatory Factor Analysis for Each Study Group (df = 80; χ² was significant in
all studies)¹

Group               χ²        CFI    RMSEA [90% CI]      SRMR
Study 1              396.96   .944   .065 [.059-.072]    .053
Study 2              377.09   .942   .074 [.066-.081]    .057
Study 3              217.38   .927   .075 [.063-.087]    .057
Study 4              390.99   .890   .097 [.088-.107]    .078
Study 5              446.09   .926   .082 [.075-.090]    .058
Study 6              505.18   .926   .079 [.072-.085]    .052
Study 7              488.06   .936   .080 [.074-.087]    .062
Study 8             1146.23   .943   .075 [.072-.079]    .056
The entire sample   3193.33   .938   .074 [.072-.077]    .054


Table 5 presents intercorrelations between latent variables from confirmatory
factor analysis (below the diagonal) and observable variables, computed from the
key (above the diagonal). The results presented in Table 5 were obtained in the
analysis of the entire sample.


¹ Information concerning standardized parameters of the model and the key to the test items
making up each scale are available from the authors upon request.


Table 5
Intercorrelations Between Scales of the IPIP-BFM-50 for the Entire Sample (N = 7015). Below the
diagonal, the table shows correlations between latent variables in confirmatory factor analysis;
above the diagonal, correlations between observable variables are shown (all the correlations
are significant at p < .01)

Scales                      E      A      C      ES     I
E (Extraversion)            1      .34    .09    .27    .36
A (Agreeableness)           .44    1      .27    .10    .27
C (Conscientiousness)       .14    .35    1      .16    .07
ES (Emotional Stability)    .33    .10    .18    1      .09
I (Intellect)               .44    .36    .12    .13    1


Measurement Invariance 
Between Different Research Conditions 


Two measurement invariance tests were applied in different research condi- 
tions. The first one tested measurement invariance between the following two 
situations: (1) the measurement of five traits using the IPIP-BFM-50 question- 
naire in one research act and (2) the measurement of five traits by means of items 
from that questionnaire dispersed in a pool of 492 items for measuring 45 perso- 
nality traits using the IPIP-45AB5C in two research acts, separated by a two- 
week interval. The second test of invariance was carried out between the tradi- 
tional paper-and-pencil research method and online research. 

Measurement invariance was verified in the procedure of multigroup confir- 
matory factor analysis (MGCFA; Cieciuch & Davidov, 2014; Vandenberg & 
Lance, 2000). Three levels of invariance were tested: configurational, metric, and 
scalar. According to Chen’s (2007) proposal, metric measurement invariance in
groups of N > 300 is considered acceptable when ΔCFI < .01, ΔRMSEA < .015,
and ΔSRMR < .03 between the configurational and metric levels. For scalar
invariance, Chen (2007) proposes the following threshold values: ΔCFI < .01,
ΔRMSEA < .015, and ΔSRMR < .01 between the metric and scalar levels.
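
The comparison of nested invariance models against these thresholds can be sketched as below. The sketch reads each Δ as a worsening of fit (a drop in CFI, a rise in RMSEA or SRMR) when constraints are added, which is an interpretation rather than something the text spells out; the class and function names are hypothetical. The example values are those reported in Table 6 for the first invariance test.

```python
from typing import NamedTuple


class Fit(NamedTuple):
    """Fit indices of one multigroup model."""
    cfi: float
    rmsea: float
    srmr: float


def invariance_supported(less_constrained: Fit, more_constrained: Fit,
                         max_srmr_rise: float) -> bool:
    """Check whether fit worsens by less than the thresholds quoted in the text.

    max_srmr_rise is .03 for the metric level and .01 for the scalar level.
    """
    cfi_drop = less_constrained.cfi - more_constrained.cfi
    rmsea_rise = more_constrained.rmsea - less_constrained.rmsea
    srmr_rise = more_constrained.srmr - less_constrained.srmr
    return cfi_drop < 0.01 and rmsea_rise < 0.015 and srmr_rise < max_srmr_rise


# Questionnaire items administered alone versus dispersed in a larger item pool.
configural = Fit(cfi=0.936, rmsea=0.053, srmr=0.053)
metric = Fit(cfi=0.935, rmsea=0.052, srmr=0.053)
scalar = Fit(cfi=0.930, rmsea=0.052, srmr=0.053)

metric_ok = invariance_supported(configural, metric, max_srmr_rise=0.03)
scalar_ok = invariance_supported(metric, scalar, max_srmr_rise=0.01)
```

Each level is compared with the level directly below it, so the scalar model is evaluated against the metric model and not against the configural one.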

Table 6 presents model fit indices for each of the three levels of invariance in 
the two tests carried out. It turned out that in all conditions the measurement was 
invariant at the configurational, metric, and scalar levels, since CFI, RMSEA, 
and SRMR delta values all fell within the acceptable range. This means that the
IPIP-BFM-50 questionnaire is unaffected by various research conditions and 
yields results undistorted by the specificity of these conditions. 


Table 6
Model Fit Indices in Multigroup Confirmatory Factor Analysis, Testing IPIP-BFM-50 Measurement
Invariance in Different Research Conditions

Invariance level    χ²          df     CFI    RMSEA [90% CI]      SRMR

The IPIP-BFM-50 questionnaire versus IPIP-BFM-50 questionnaire items in a pool of 492 items
Configurational     2305.634    160    .936   .053 [.051-.055]    .053
Metric              2344.430    170    .935   .052 [.050-.054]    .053
Scalar              2502.617    180    .930   .052 [.050-.054]    .053

Paper-and-pencil versus online study
Configurational     3363.859    160    .939   .053 [.051-.055]    .053
Metric              3407.622    170    .938   .052 [.050-.058]    .053
Scalar              3598.179    180    .935   .052 [.050-.058]    .054

Note. In testing invariance between different arrangements of IPIP-BFM-50 items, the groups consisted of
N = 3732 and N = 936 participants, respectively; in testing invariance between the online and offline versions,
the groups consisted of N = 2347 and N = 4668 participants, respectively.


External Validity 


Correlations with other Big Five measures 


Table 7 shows correlations of IPIP-BFM-50 scales with NEO-PI-R (the first 
and the second study) and NEO-FFI scales (the seventh study). The correlation 
coefficient values obtained confirm the theoretical validity of the IPIP-BFM-50. 
Correlations between corresponding scales were considerably higher than the
remaining ones. The highest (negative) correlations were found between the
IPIP-BFM-50 Emotional Stability scale and the Neuroticism scales of the NEO-FFI
and the NEO-PI-R. Rather unexpectedly, correlations were the lowest not in the
case of the Intellect and Openness to Experience scales but in the case of the
Agreeableness scales. In particular, the correlation between the IPIP-BFM-50
Agreeableness scale and its counterpart in the NEO-PI-R was much lower than
expected. This may be due to certain differences in the conceptualization of the
Agreeableness factor between the lexical and questionnaire traditions. This is not
only about the above-mentioned warmth, which is part of Agreeableness in the
lexical tradition and a facet of Extraversion in the questionnaire tradition.
Agreeableness in the questionnaire tradition contains more aspects connected
with modesty and morality than the corresponding factor in the lexical approach
(cf. Ashton & Lee, 2005).


Table 7
Values of Pearson’s r Coefficient of Correlation Between the IPIP-BFM-50 and the NEO-FFI
(N = 782) as Well as the NEO-PI-R (N = 685)

                                   IPIP-BFM-50 scales
Scales               Extraversion   Agreeableness   Conscientiousness   Emotional Stability   Intellect

NEO-FFI
Extraversion          .67**          .32**            .01                 .25**                 .34**
Agreeableness         .06            .51**            .11*                .24**                -.09*
Conscientiousness    [illegible]    [illegible]       .69**               .26**                 .09*
Neuroticism          -.35**         -.17**           -.19**              -.70**                -.26**
Openness              .23**         [illegible]      -.04                 .09*                  .56**

NEO-PI-R
Extraversion         [illegible]    [illegible]      [illegible]         [illegible]           [illegible]
Agreeableness        -.12**          .47**           [illegible]          .05                  -.14
Conscientiousness     .08*           .13**            .61**               .18**                 .07
Neuroticism          -.25           -.04             -.22                -.65                  -.14
Openness              .22**         [illegible]       .02                 .03                   .55**

Note. * p < .01; ** p < .001. The correlation coefficients between corresponding scales lie on the
diagonal of each block.


Differentiation 
of personality traits by gender 


Comparisons of scores obtained by women and by men were performed on 
the entire sample, after the confirmation of scalar measurement invariance (Table 
8). The greatest gender differences occurred in the case of Emotional Stability
and Agreeableness. Women showed considerably lower Emotional Stability (i.e.,
higher Neuroticism) and higher Agreeableness than men. That result is fully consistent with the research
conducted using the NEO-PI-R and the NEO-FFI both in Poland (Siuta, 2006; 
Zawadzki et al., 1998) and in the United States (Costa & McCrae, 1992). On the 
other hand, a higher level of Intellect was shown by men, which runs contrary to 
studies using NEO questionnaires. In those studies, it is women that showed 
higher Openness (Siuta, 2006; Zawadzki et al., 1998), although it must be said 
that this was mainly due to such aspects of this factor as aesthetics or feelings, 
which are more weakly represented in the lexical Intellect. When it comes to 
gender differences in Conscientiousness and Extraversion, the results of studies
using NEO questionnaires are not so clear. Research using the IPIP-BFM-50
showed a higher level of Conscientiousness in women, but no significant
differences in Extraversion were found.


Table 8
Differences in Means Between the Groups of Women (N = 3747) and Men (N = 3327)

                        Women           Men
Scale                   M      SD       M      SD       t(7072)
Extraversion            3.36   0.77     3.34   0.77      0.82
Agreeableness           3.90   0.59     3.80   0.62     [illegible]
Conscientiousness       3.48   0.66     3.43   0.66     [illegible]
Emotional Stability     2.97   0.80     3.14   0.79     -8.81**
Intellect               3.57   0.59     3.60   0.58     -2.44*

Note. ** p < .01; * p < .05.
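
A comparison of this kind can be approximated from the published summary statistics alone, as sketched below for Emotional Stability. The sketch assumes a pooled-variance independent-samples t test, which is consistent with the reported degrees of freedom but is not stated explicitly; Cohen’s d is added here as an effect-size measure and is not reported in the paper. Because the inputs are rounded to two decimals, the result can only approximate the reported t value.

```python
from math import sqrt
from typing import Tuple


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def t_from_summary(m1: float, sd1: float, n1: int,
                   m2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Pooled-variance t statistic and its degrees of freedom."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / standard_error, n1 + n2 - 2


def cohens_d(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> float:
    """Standardized mean difference (an addition, not reported in the paper)."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


# Emotional Stability: women (M = 2.97, SD = 0.80, N = 3747)
# versus men (M = 3.14, SD = 0.79, N = 3327).
t_value, df = t_from_summary(2.97, 0.80, 3747, 3.14, 0.79, 3327)
d_value = cohens_d(2.97, 0.80, 3747, 3.14, 0.79, 3327)
```

In samples of this size even small standardized differences reach statistical significance, so an effect size is a useful complement to the t statistic.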


Relationships between personality traits and age 


The following tendencies were found: Conscientiousness (r = .26), Emotional
Stability (r = .08), and Agreeableness (r = .02) increase with age, while Intellect
(r = -.18) and Extraversion (r = -.06) decrease with age. These findings are fully
consistent with those obtained using NEO questionnaires in Poland and in the
United States (Costa & McCrae, 1992; Siuta, 2006; Zawadzki et al., 1998),
although the values of the correlation coefficient are rather low.


* 


Many instruments have been developed for measuring the basic five perso- 
nality traits (cf. De Raad & Perugini, 2002). Apart from commercial inventories, 
such as Costa’s and McCrae’s (1992) NEO-PI-R and NEO-FFI (NEO Five- 
Factor Inventory), designed not only for scientific research but also for individu- 
al assessment, noncommercial questionnaires designed only for research purpos- 
es have been gaining popularity in recent years as well. One of them is Gold- 
berg’s (1992) IPIP-BFM-50 for measuring five personality traits as identified in 
the lexical tradition. 

The paper presents the Polish adaptation of this questionnaire. The data 
subjected to analyses were collected from 7015 individuals in eight studies. The
IPIP-BFM-50 has good psychometric properties. Its reliability, verified in the
analysis of Cronbach’s alpha values, was found to be satisfactory. The question- 
naire’s five-factor structure may be regarded as satisfactorily confirmed in con- 
firmatory factor analysis. The IPIP-BFM-50 turned out to be a measure unaf- 
fected by different research conditions, such as online versus offline study or 
studies with different arrangements of test items (arrangement of items as in the 
questionnaire versus questionnaire’s statements mixed with other statements). 

The IPIP-BFM-50 questionnaire is not subject to any usage restrictions in 
scientific research. Like all the questionnaires from the IPIP project, it may be used
free of charge in any form, paper or online. The advantages of the IPIP-BFM-50
presented above show that this measure can be used in the currently studied and 
discussed areas (cf. Strus & Cieciuch, 2014). This is due not only to good psy- 
chometric properties and no usage restrictions but also to the fact that it is 
a questionnaire measuring the lexical version of the Five-Factor Model. The 
IPIP-BFM-50 may also be said to be, in some sense, a kind of synthesis of the 
two traditions that have been instrumental in the emergence and development of 
this model. 